178 research outputs found

    A Parameterisation of Algorithms for Distributed Constraint Optimisation via Potential Games

    No full text
    This paper introduces a parameterisation of learning algorithms for distributed constraint optimisation problems (DCOPs). This parameterisation encompasses many algorithms developed in both the computer science and game theory literatures. It is built on our insight that when formulated as noncooperative games, DCOPs form a subset of the class of potential games. This result allows us to prove convergence properties of algorithms developed in the computer science literature using game theoretic methods. Furthermore, our parameterisation can assist system designers by making the pros and cons of, and the synergies between, the various DCOP algorithm components clear
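    The potential-game property mentioned in this abstract can be illustrated on a toy problem. The following is a minimal sketch, not the paper's parameterisation: the graph-colouring instance, the edge reward, and all function names are assumptions chosen for illustration. Each agent's utility is the sum of the constraints it takes part in, and the global objective acts as the potential function.

```python
from itertools import product

# Hypothetical toy DCOP (an assumption): 2-colouring of a 4-cycle.
EDGES = [(0, 1), (1, 2), (2, 3), (3, 0)]
COLOURS = [0, 1]
N_AGENTS = 4

def edge_reward(a, b):
    """Constraint payoff: 1 if the two neighbours differ, else 0."""
    return 1 if a != b else 0

def utility(i, state):
    """Agent i's utility: sum of rewards on the constraints it takes part in."""
    return sum(edge_reward(state[u], state[v]) for u, v in EDGES if i in (u, v))

def potential(state):
    """Global objective: sum of rewards over all constraints."""
    return sum(edge_reward(state[u], state[v]) for u, v in EDGES)

def deviate(state, i, colour):
    """Return a copy of state in which agent i plays the given colour."""
    return state[:i] + (colour,) + state[i + 1:]

def is_exact_potential_game():
    """Check that each unilateral deviation changes utility and potential equally."""
    for state in product(COLOURS, repeat=N_AGENTS):
        for i in range(N_AGENTS):
            for c in COLOURS:
                new = deviate(state, i, c)
                if utility(i, new) - utility(i, state) != potential(new) - potential(state):
                    return False
    return True

def best_response_dynamics(state):
    """Agents take turns switching to a strictly better colour until none can."""
    state = tuple(state)
    improved = True
    while improved:
        improved = False
        for i in range(N_AGENTS):
            best = max(COLOURS, key=lambda c: utility(i, deviate(state, i, c)))
            candidate = deviate(state, i, best)
            if utility(i, candidate) > utility(i, state):
                state = candidate
                improved = True
    return state
```

    Because every strictly improving move raises the potential and the state space is finite, the loop must terminate at a pure strategy Nash equilibrium. On this toy instance, starting from `(0, 0, 0, 0)` ends at `(1, 0, 1, 0)`, where all four constraints are satisfied.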

    Learn While You Earn: Two Approaches to Learning Auction Parameters in Take-it-or-leave-it Auctions

    No full text
    Much of the research in auction theory assumes that the auctioneer knows the distribution of participants' valuations with complete certainty. However, this is unrealistic. Thus, we analyse cases in which the auctioneer is uncertain about the valuation distributions; specifically, we consider a repeated auction setting in which the auctioneer can learn these distributions. Using take-it-or-leave-it auctions (Sandholm and Gilpin, 2006) as an exemplar auction format, we consider two auction design criteria. Firstly, an auctioneer could maximise expected revenue each time the auction is held. Secondly, an auctioneer could maximise the information gained in earlier auctions (as measured by the Kullback-Leibler divergence between its posterior and prior) to develop good estimates of the unknowns, which are later exploited to improve the revenue earned in the long run. Simulation results comparing the two criteria indicate that setting offers to maximise revenue does not significantly detract from learning performance, but optimising offers for information gain substantially reduces expected revenue while not producing significantly better parameter estimates

    A Distributed Algorithm for Demand Response with Mixed-Integer Variables

    Full text link
    This letter presents a fast distributed algorithm for aggregating a large number of households with mixed-integer variables and intricate couplings between devices. The proposed fast distributed gradient algorithm is applied to the double smoothed dual function of the adopted DR model. The results also show that, with minimal parameter adjustments, the convergence of the dual objective exhibits the same behavior irrespective of the system size. Comment: 2 pages, 1 figure, to be published in IEEE Transactions on Smart Grid Letter

    Control of large distributed systems using games with pure strategy Nash equilibria

    No full text
    Control mechanisms for optimisation in large distributed systems cannot be constructed based on traditional methods of control because such systems are typically characterised by distributed information and costly and/or noisy communication. Furthermore, noisy observations and dynamism are also inherent to these systems, so their control mechanisms need to be flexible, agile and robust in the face of these characteristics. In such settings, a good control mechanism should satisfy the following four design requirements: (i) it should produce high quality solutions, (ii) it should be robust and flexible in the face of additions, removals and failures of components, (iii) it should operate by making limited use of communication, and (iv) its operation should be computationally feasible. Against this background, and in order to satisfy these requirements, in this thesis we adopt a design approach based on dividing control over the system across a team of self-interested agents. Such multi-agent systems (MAS) are naturally distributed (matching the application domains in question), and by pursuing their own private goals, the agents can collectively implement robust, flexible and scalable control mechanisms. In more detail, the design approach we adopt is (i) to use games with pure strategy Nash equilibria as a framework or template for constructing the agents' utility functions, such that good solutions to the optimisation problem arise at the pure strategy Nash equilibria of the game, and (ii) to derive distributed techniques for solving the games for their Nash equilibria. The specific problems we tackle can be grouped into four main topics. First, we investigate a class of local algorithms for distributed constraint optimisation problems (DCOPs). We introduce a unifying analytical framework for studying such algorithms, and develop a parameterisation of the algorithm design space, which represents a mapping from the algorithms' components to their performance according to each of our design requirements. Second, we develop a game-theoretic control mechanism for distributed dynamic task allocation and scheduling problems. The model in question is an expansion of DCOPs to encompass dynamic problems, and the control mechanism we derive builds on the insights from our first topic to address our four design requirements. Third, we elaborate a general class of problems including DCOPs with noisy rewards and state observations, which are realistic traits of great concern in real-world problems, and derive control mechanisms for these environments. These control mechanisms allow the agents to either learn their reward functions or decide when to make observations of the world's state and/or communicate their beliefs over the state of the world, in such a manner that they perform well according to our design requirements. Fourth, we derive an optimal algorithm for computing and optimising over pure strategy Nash equilibria in games with sparse interaction structure. By exploiting the structure present in many multi-agent interactions, this distributed algorithm can efficiently compute equilibria that optimise various criteria, thus reducing the computational burden on any one agent and operating using less communication than an equivalent centralised algorithm. For each of these topics, the control mechanisms that we derive are developed such that they perform well according to all four of our design requirements. In sum, by making the above contributions to these specific topics, we demonstrate that the general approach of using games with pure strategy Nash equilibria as a template for designing MAS produces good control mechanisms for large distributed systems

    Decentralised Dynamic Task Allocation Using Overlapping Potential Games

    No full text
    This paper reports on a novel decentralised technique for planning agent schedules in dynamic task allocation problems. Specifically, we use a stochastic game formulation of these problems in which tasks have varying hard deadlines and processing requirements. We then introduce a new technique for approximating this game using a series of static potential games, before detailing a decentralised method for solving the approximating games that uses the distributed stochastic algorithm. Finally, we discuss an implementation of our approach to a task allocation problem in the RoboCup Rescue disaster management simulator. The results show that our technique performs comparably to a centralised task scheduler (within 6% on average), and also, unlike its centralised counterpart, it is robust to restrictions on the agents' communication and observation ranges

    Knapsack based Optimal Policies for Budget-Limited Multi-Armed Bandits

    Full text link
    In budget-limited multi-armed bandit (MAB) problems, the learner's actions are costly and constrained by a fixed budget. Consequently, an optimal exploitation policy may not be to pull the optimal arm repeatedly, as is the case in other variants of MAB, but rather to pull the sequence of different arms that maximises the agent's total reward within the budget. This difference from existing MABs means that new approaches to maximising the total reward are required. Given this, we develop two pulling policies, namely: (i) KUBE; and (ii) fractional KUBE. Whereas the former performs up to 40% better in our experimental settings, the latter is computationally less expensive. We also prove logarithmic upper bounds for the regret of both policies, and show that these bounds are asymptotically optimal (i.e. they only differ from the best possible regret by a constant factor)
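    The point that the best exploitation policy may mix arms rather than repeat one can be seen on a small knapsack instance. The following is a minimal sketch and not KUBE itself: it assumes the expected rewards and integer costs are already known (the learner in the paper has to estimate them), and the arm values and function names are hypothetical.

```python
def greedy_pulls(arms, budget):
    """Visit arms in decreasing reward/cost order, pulling each as often as the budget allows."""
    order = sorted(arms, key=lambda name: arms[name][0] / arms[name][1], reverse=True)
    pulls, total = {}, 0.0
    for name in order:
        reward, cost = arms[name]
        count = budget // cost  # number of affordable pulls of this arm
        if count:
            pulls[name] = count
            total += count * reward
            budget -= count * cost
    return pulls, total

def optimal_value(arms, budget):
    """Exact unbounded-knapsack value via dynamic programming over integer budgets."""
    best = [0.0] * (budget + 1)
    for b in range(1, budget + 1):
        for reward, cost in arms.values():
            if cost <= b:
                best[b] = max(best[b], best[b - cost] + reward)
    return best[budget]

# Hypothetical arms (an assumption): name -> (expected reward, integer cost).
ARMS = {"A": (6.0, 4), "B": (1.0, 1)}
```

    With a budget of 5, pulling only arm A earns 6 and pulling only arm B earns 5, whereas `greedy_pulls(ARMS, 5)` pulls A once and B once for a total of 7, which matches `optimal_value(ARMS, 5)` on this instance. Density-ordered greedy is not optimal for unbounded knapsack problems in general.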